Repreceive the urban environment by social media posts

–A possible approach to quantify massive qualitative urban experience


Supervisor: Carole Voulgaris

Abstract

Understanding and reflecting urban dwellers’ happiness, and perception of urban space and life in urban planning has long been a challenge in China. The Chinese government has progressively brought attention to public participation and views, which had previously been ignored. However, government approaches for fostering public engagement remain scarce, while enthusiasm among individuals to express feelings and comments on government planning is lacking. In this context, my research project proposes measuring individual citizens’ emotions and characteristics of the built environment,the proximity to various urban amenities, to explore the relationships between citizens’ sentiments, urban space, and urban activities. This approach can be applied to gather public feelings in the urban planning process through platforms, as a way to quantify the quality of urban life. By establishing a quantitative urban model, the happiness of urban residents(the sentiment of social media post) can become an indicator and standard as a way of bottom-up public participation, guiding planners’ decisions to design a healthy, sustainable and people-centered city.

Introduction

In China, the early stage of urbanization was subject to the national ambition and development, whose aim at boosting the economic production and consumption. As such, hundreds and thousands of cities and towns were built to serve for this purpose that plays an important role in improving the national economy based on the scarification of natural resource and incline of factory production. However, since the decreasing demand for industrial production and the arise of urban awareness of residents happiness, Chinese government increasingly focus on citizen wellbeing, urban governance and the urban environment.

The new urban policies in China shifted the emphasis of urbanization away from economic development and toward human-centric development, improving residents’ wellbeing and building new ecological smart cities. However, the evaluation of residents’ wellbeing and urban planning process remains top-down and entirely conducted by governments and experts. The qualitative surveys and reports could only cover a small proportion of population, and it becomes even harder and time-consuming as the population in cities and towns reached 848 millions.

With the introduction of social media data and machine learning technologies, new methods for studying urban spatial patterns and residents’ life have emerged. In 2017, there were more than 753 million people have access to mobile internet, while 68% of them frequently use social media platform. The AI-based technology such as sentiment analysis could be used to extract individual feelings from text-based information, as a form of public perception of urban space, contributing to bottom-up engagement and people-oriented urban planning.

Literature Review

Wellbeing as a Measure of Health and the Built environment

Scientists have historically measured well-being using objective indicators (e.g., GDP, health, employment, literacy, poverty) and increasing measured subjective well-being that influences individual life. Modern measures of well-being that account for cognitive evaluations (i.e., evaluative well-being) and reactions to experiences (i.e., experienced well-being) have therefore become the “currency of a life that matters” (Rath et al., 2010). As the concept of well-being develops, the indices including physical health, mental health, air quality and more are increasingly used, implying a strong relationship between health and residents’ well-being (Diener et al., 1999; Lawless & Lucas, 2010). Some studies found that population density may affect well-being on the city level (Florida et al., 2013). Mouratidis (2018) argues that compact urban form with better public transport, accessibility, the mix of land uses, and density positively influences neighborhood well-being. Social and human capital, considered significant drivers of urban well-being, can be affected by safety, educational opportunities, and access to arts and culture (Leyden et al., 2011; Florida et al., 2013). Other aspects of urban infrastructure (such as roads and transportation) impact commute time and connectedness, both of which are related to happiness (Yin et al., 2021; Gim, 2021).

Big data, Social media and Sentiment analysis

Recently, the availability of mobile information, big urban data and machine learning technology has significantly enhanced urban research methods, particularly in terms of the dynamic spatial and temporal relationships between human behaviour and urban research (Wu, Ye, Ren, & Du, 2018). The invitation of various big data could change China’s top-down urban planning process by bringing individual information into the discussion of urban space and quality. With the rise of social media and machine learning, sentiment analysis became a field of study that mines public opinions, measures subjective happiness and relates them to different research areas. The previous studies applied Twitter, Sina Weibo data (Chinese Twitter) and Dazhong dianping (Chinese Yelp) to examine the relationship between urban form, spatial quality, and public comments, such as the relationship between quality of urban parks and travellers’ behaviour and words; urban transportation and commercial facilities (Li et al., 2018; Ta et al., 2020; Yao et al., 2019; Plunz et al., 2019). Some research further studied the difference and distribution of residents’ emotions regarding gender, time-dimension and different urban facilities (Ma et al., 2020; Zhen et al., 2018).

Quantitative urban measure of the built environment

The measurement of the built environment is constructed by a variety body of indices to address different urban issues. Cervero and Kockelman’s developed initial “three Ds” (density, design, and diversity) in 1997 to evaluate the existing urban built environment. Edwing et al. expanded on this concept by adding two Ds (destination accessibility and transportation distance) (Ewing & Cervero, 2001; Ewing et al., 2009). More Ds were added afterwards to reflect the changing built environment, such as Demand management and Demographics (Ewing & Cervero, 2010). Scholars have modified the list of variables based on these quantitative frameworks to comprehensively examine the built environment while addressing various urban issues and topics. Some research used relative entropy to discern compactness from sprawl in the built environment (Tsai, 2005). Others used a multi-metric urban intensity index at a metropolitan scale, which included land use, infrastructure, and landscape variables in addition to density and compactness (Tate et al., 2005). More recent studies, especially in the Chinese context, Rowe et al. (2014) proposed the measurement of urban intensity from variables of compactness, density, diversity, and connectivity, aiming at revealing the resource distribution, transportation efficiency, and social integration in both cities and optimize the urban performance (Rowe et al., 2014). Later, Guan and Rowe (2016) evaluated the spatial structure of small towns in Zhejiang Province using similar urban intensity characteristics, such as density, compactness, diversity, and accessibility.

Research proposal

The research aims to apply sentiment analysis to quantify qualitative public feelings from text-based information as a representation of individual real-time happiness, and develop a multivariate model to explore the relationship between the individual feelings from social media and the built environment in China. Based on the model, this research compare existing condition and the planned condition of Jiaxing city in China, and test different planning scenarios to explore the possible changes of individual feeling when using social media platform.

Methodology

Site

The study area may be Jiaxing city in China’s Zhejiang province. Jiaxing is a significant city that is part of the Yangtze River Delta city cluster and the Shanghai metropolitan area. It is located in close proximity to the two major cities of Shanghai and Hangzhou. It is a small tourist city with two counties, 44 towns, and 809 administrative villages that has been designated as a national key program.

Data collection and preprocess

For this research, social media data (Weibo posts) was collected from Sina Weibo, which is one of the most used online social platform in China. The collected Weibo posts are restricted with the tag “Jiaxing” in 2018, which limited the posts related to the case city. The untreated data included a large amount of advertisements, which have been removed by identifying certain keywords such as “advertisement”, “cost performance”, etc. The urban amenity data is collected from Gaode Map as spatial POI (points of interest). The regulatory detailed planning (RDP) documents from 2017-2020 issued by the local government in Jiaxing are collected from the local government office.

Sentiment analysis of social media post

To study individual happiness and its response to different built environments, this research collects within one year as a primary resource via the API of Sina Weibo. Individual emotional preference (positive probability of individual post ) would be defined as individual real-time happiness toward the built environment. The Baidu AI platform has allowed researchers to analyze public sentiment via its well-developed emotion dictionaries and its trained machine learning model for decades in the Chinese context. Thus, the sentiment analysis of individual posts will be conducted by accessing a machine learning model from the Baidu AI platform. The return value of individual post contains three parts: sentiment category, positive probability and negative probability. In this research, the positive probability is used to represent the possible level of happiness of individual post.



Proximity to urban amenities

In this research, proximity is defined by the accessibility of urban activities and amenities by walking and by characteristics of pedestrian networks. The amenities dataset is collected as points of interest (POIs) from Gaodu Map in China, which is one of the most widely used digital navigation systems in China. The urban networks are extracted from the OpenStreet Map (OSM). Accessibility will be calculated by the R5 package embedded in the R studio.

Correlation between individual sentiment and the built environment

To understand how the urban environment would affect individual sentiment from social media, a series of regression models between sentiment results and the proximity to urban amenities were conducted.To further improve the interpretation of regression model, the regression analysis between sentiment result and the different proximity based on various walking distance were applied. In addition, adding the time, weather and workday variables helps improve the model and further understand which variable might affect individual sentiment other than the built environment.

Prediction of sentiment changes based on the regression model

By applying the established regression model, the positive probability of individual post can be simulated by providing a set of proximity to urban amenities which can be calculated by the geo-locations of them. As such, this research provides three different planning for scenarios: the official planning documents, educational city (double the number of school) and tourism city (replace the school with greenspace). Each scenario associates with individual set of urban amenities, simulating different individual sentiment map of Jiaxing. By comparing the scenarios and the exiting condition, we could better understand how the built environment may affect individual feelings in social media, and more importantly, design a better city of citizen.

Results

Regression analysis on the relationship between individual sentiment and the prolixity to urban amenities

The regression model shows that proximity to restaurant, busstop, green space, leisure space are positively related to individual sentiment on weibo post; while the proximity to school and small business are negatively related to individual sentiment. In addition, adding the time, weather and workday variables, I found that the Time of the weibo post is statistic significantly related to individual sentiment, while weather and workday variables are not related to individual sentiment.

The threshold tests indicate the best fits of walking distance to different amenities for the regression models: the proximity to restaurant within 15 mins, the proximity to school within 25 mins, the proximity to green space within 25 mins, the proximity to hospital within 25 mins, the proximity to bus stops within 70 mins, the proximity to leisure space within 20 mins, the proximity to small business within 15 mins, the proximity to shopping center within 25 mins.

Scenario1

The planning scenario 1 of Jiaxing is based on the offical planning document issued by local government. Base on the changes of residential and business zoning in Jiaxing, the simulation of new restaurant locations are generated by assuming the same density of restaurant in these zoning. After comparing of the averages of public sentiment between the planned condition and existing condition, it was surprising to find that the simulated future sentiment was less than the existing sentiment result at -0.05382951. It suggests that the future Jiaxing might not improve residents’ happiness in social media.



Scenario2

Scenario 2 intends to simulate an educational Jiaxing city by doubling the number of school spreading out the city. The original schools remind the same locations, while randomly generating same amount of schools spreading out the city. After comparing of the averages of public sentiment between the scenario 2 and existing condition, the simulated sentiment was less than the existing sentiment result at -0.0002921917 due to the negative relationship between school and public sentiment. This finding remains questionable; it is understandable that people might not be happly to stay at school, however, it should play an important role in long-term wellbeing.

Scenario13

Scenario 3 intends to simulate an green Jiaxing city by turning school into green space. The planning strategy based on the negative correlation between school and public sentiment, while a positive correlation between green space and public sentiment, assuming a happier Jiaxing.The finding supports such assumption: the simulated average sentiment was more than the existing sentiment result at 0.005517993.

Limitation

Although the relationships between public sentiment and the proximity to amenities are indicated by a set of regression models, there are no proof to suggest any causation. In addition, in this research, author assumed the public sentiment from social media could represent individual real-time happiness, which remains uncertain awaiting a further examination. What is more, due to the limited urban variables, the confidence level of regression model is relatively low at 2%, which could be used to study the changes of individual sentiment but is incapable to simulate a convincing results. Last but not least, the weibo post could only represent the users who mostly age from 16-40, which are around 30% of the total population who have access to mobile internet.

Contribution

This research shows certain statistic significant relationship between public sentiment and the proximity to amenities, which proves author’s hypothesis that the built environment might affect individual feeling. It shows a potential to help planner understand which characteristics of the built environment may improve residents’ happiness. Importantly, this research establishes a framework to quantify urban experience from text-based information and build a simulation model based on that. Imagine a new platform that allows all citizen to post comments and feelings in the city. The more precise and targeted data could be used in this framework and help build a bottom-up participation in city management and planning. As such, we could see the online platform as a new form of infrastructure of urban life, which might bring a great potential to design a more people-oriented city or a new urban form.

Reference

  1. Cervero, R., & Kockelman, K. (1997). Travel demand and the 3Ds: Density, diversity, and design. Transportation research part D: Transport and environment, 2(3), 199-219.
  2. Conway, J. R., Lex, A., & Gehlenborg, N. (2017). UpSetR: an R package for the visualization of intersecting sets and their properties. Bioinformatics.
  3. Diener, E., Suh, E. M., Lucas, R. E., & Smith, H. L. (1999). Subjective well-being: Three decades of progress. Psychological bulletin, 125(2), 276.
  4. Ewing, R., & Cervero, R. (2001). Travel and the built environment: a synthesis. Transportation research record, 1780(1), 87-114.
  5. Ewing, R., & Cervero, R. (2010). Travel and the built environment: A meta-analysis. Journal of the American planning association, 76(3), 265-294.
  6. Ewing, R., Greenwald, M. J., Zhang, M., Walters, J., Feldman, M., Cervero, R., & Thomas, J. (2009). Measuring the impact of urban form and transit access on mixed use site trip generation rates—Portland pilot study. US Environmental Protection Agency, Washington, DC.
  7. Florida, R., Mellander, C., & Rentfrow, P. J. (2013). The happiness of cities. Regional studies, 47(4), 613-627.
  8. Guan, CH. & Rowe, P.G. (2016). The concept of urban intensity and China’s townization policy: Cases from Zhejiang Province. Cities. 55. 22-41.
  9. Gim, T. H. T. (2021). Comparing Happiness Determinants for Urban Residents A Partial Least Squares Regression Model. International Review for Spatial Planning and Sustainable Development, 9(2), 24-40.
  10. Hogertz, C. (2010). Emotions of the urban pedestrian: sensory mapping. Pedestrians’ quality needs, 31, 31-52.
  11. Lawless, N. M., & Lucas, R. E. (2011). Predictors of regional well-being: A county level analysis. Social Indicators Research, 101(3), 341-357.
  12. Leyden, K. M., Goldberg, A., & Michelbach, P. (2011). Understanding the pursuit of happiness in ten major cities. Urban affairs review, 47(6), 861-888.
  13. Li, SJ., Ma, S., Zhang M., Long Y. (2018). Muiti-scale Evaluation of Urban Green Space Based on Muti-source New Data:Exploration of Main Cities in China. Landscape Architecture.
  14. Ma, Y., Ling, C., & Wu, J. (2020). Exploring the Spatial Distribution Characteristics of Emotions of Weibo Users in Wuhan Waterfront Based on Gender Differences Using Social Media Texts. ISPRS International Journal of Geo-Information, 9(8), 465.
  15. Mouratidis, K. (2018). Rethinking how built environments influence subjective well-being: A new conceptual framework. Journal of Urbanism: International Research on Placemaking and Urban Sustainability, 11(1), 24-40.
  16. OECD Better Life Index. How’s Life? 2020 Measuring Well-Being; OECD Publishing: Paris, France, 202Jiang, GH., Ma, WQ., Wang, DQ., Zhou, DY., Zhang, RJ., & Zhou, T. (2017). Identifying the internal structure evolution of urban built-up land sprawl (UBLS) from a composite structure perspective: A case study of the Beijing metropolitan area, China. Land Use Policy, 62, 258-267.
  17. Orii, L., Alonso, L., & Larson, K. (2020). Methodology for Establishing Well-Being Urban Indicators at the District Level to be Used on the CityScope Platform. Sustainability, 12(22), 9458.
  18. Plunz, R. A., Zhou, Y., Vintimilla, M. I. C., Mckeown, K., Yu, T., Uguccioni, L., & Sutto, M. P. (2019). Twitter sentiment in New York City parks as measure of well-being. Landscape and urban planning, 189, 235-246.
  19. Rowe, P. G., & Kan, H. Y. (2014). Urban Intensities: Contemporary Housing Types and Territories. Birkhäuser.
  20. Rath, T., Harter, J. K., & Harter, J. (2010). Wellbeing: The five essential elements. Simon and Schuster.
  21. Song, Y., & Gerrit-Jan, K. (2004). Measuring urban form. American Planning Association. Journal of the American Planning Association, 70, 210–225.
  22. Sun, S. (2017). On Urban Planning System in China, URP 1, 16-25.
  23. Shen, Y., & Karimi, K. (2016). Urban function connectivity: Characterisation of functional urban streets with social media check-in data. Cities, 55, 9-21.
  24. Tsai, Y. (2005). Quantifying urban form: Compactness versus “sprawl”. Urban Studies, 42(1), 141–161.
  25. Tate, C., Cuffney, T., McMahon, G., Giddings, E., Coles, J., Zappia, H. (2005). Use of an urban intensity index to assess urban effects on streams in three contrasting environmental settings. In Effects of urbanization on stream ecosystems. Symposium 47, 291–315. American Fisheries Society, Bethesda, Maryland, USA.
  26. Teller, C. & Reutterer, T. (2008). The Evolving Concept of Retail Attractiveness: What Makes Retail Agglomerations Attractive When Customers Shop at Them? Journal of Retailing and Consumer Services, 15(3), 127-143.
  27. Wu, C., Ye, X., Ren, F., & Du, Q. (2018). Check-in behaviour and spatio-temporal vibrancy: An exploratory analysis in Shenzhen, China. Cities, 77, 104-116.
  28. Wu, Z., & Ye, Z. (2016). Research on urban spatial structure based on Baidu heat map: A case study on the central city of Shanghai. City Planning Review, 40(4), 33–40.
  29. Yin, C., & Shao, C. (2021). Revisiting commuting, built environment and happiness: New evidence on a nonlinear relationship. Transportation Research Part D: Transport and Environment, 100, 103043.
  30. Yao, QA., Guan, TW., Shi, J. (2019). Coupling Development of Rail Transit and Commercial Facilities Based on Public Comment Data:Taking Tianjin as an Example. Journal of Tianjin Chengjian University.
  31. Zhang, D. G. (2019). The theoretical and practical research on Small Towns with Chinese Characteristics. People Press.
  32. Zhang, P., Yang, D., Qin, M., & Jing, W. (2020). Spatial heterogeneity analysis and driving forces exploring of built-up land development intensity in Chinese prefecture-level cities and implications for future Urban Land intensive use. Land Use Policy, 99, 104958.
  33. Zhen, F., Tang, J., & Chen, Y. (2018). Spatial distribution characteristics of residents’ emotions based on Sina Weibo big data: A case study of Nanjing. In Big data support of urban planning and management (pp. 43-62). Springer, Cham.